2016/04/28

Introduction

How do we move towards more data-driven decision making for conservation and management?

We have lots of data

We have lots of models

So what could go wrong?

Outline

Part I: Making the data fit the models

  • Model here, data there
  • Long tail data
  • Scaling

Part II: Making the model fit the data

  • Decision Theory
  • POMDPs

Part I: Ecoinformatics – making the data fit the models

Expanding sources of data

Remote Sensors

Microsensors

NEON

Ocean Observatory Initiative

Data publication

Model here, data there

Data access

  • Online databases written for humans but not computers
  • Online databases written for computers but not humans
  • Written for the wrong humans

rOpenSci

Not written for computers

Not written for humans

Data written for the wrong humans?

APIs

rOpenSci Packages

rOpenSci data packages aim to be:

Written for humans

  • Provide a simple and logical query language for accessing data

Written for computers

  • Provide a programmatic interface to the data: it should be possible to automate repetitive data access tasks.

Format friendly

  • Provide the data in formats where modeling can be most readily applied (data.frames instead of XML, format exchange methods)
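
The "format friendly" goal above can be illustrated with a minimal sketch that flattens an XML response into one row per record, ready for a table or data frame. This is not rOpenSci code: the XML payload, its element names, and the `xml_to_rows` helper are all hypothetical, and only the Python standard library is assumed.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML payload; element names and values are invented for illustration.
XML_RESPONSE = """
<records>
  <record><species>species A</species><year>2014</year><count>12</count></record>
  <record><species>species A</species><year>2015</year><count>9</count></record>
</records>
"""


def xml_to_rows(xml_text):
    """Flatten each <record> element into a dict of column name -> text value."""
    root = ET.fromstring(xml_text.strip())
    return [{child.tag: child.text for child in record}
            for record in root.iter("record")]


rows = xml_to_rows(XML_RESPONSE)
for row in rows:
    print(row)
```

Each row is a plain dictionary with the same keys, so the list can be handed to a tabular library without further parsing. Values stay as strings here; type conversion is left to the caller.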

Long tail data

Beyond databases: the long tail of academic data

Vertically Integrated Repositories =

Metadata Repository = Data Lake

Metadata Repository

rOpenSci data publication tools aim to:

Workflow

Data Synthesis

Semantics

Long tail data

  • Data discovery
  • Data management
  • Data synthesis
  • Data publication

Scaling

Data doesn't fit the computer

  • Most computation is still local
  • Scaling beyond the laptop usually involves complete re-tooling (and far less friendly tools)
  • Doesn't have to be this way

Rocker

Jetstream

Part II: Optimal control – making the model fit the data

Methods ignore data realities

  • missing data
  • non-random bias
  • data heterogeneity
  • non-stationarity
  • imperfect observations

Today, we'll focus on just one:

  • imperfect observations

Consider optimal control problems:

  • Practical application, clear objectives
  • Decision theory: optimal control of stochastic systems …
  • … but typically still requires strong assumptions

Why Optimal Control?

  1. Cannot have an answer without a question
  2. Distinguish between what we need to know and what doesn't matter
  3. Consider "actions" as well as "states"

Sumatran Tiger

Spartina

Lampert et al. (2014) Science

Fisheries

Markov Decision Processes
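
When the state is fully observed, a small discrete MDP can be solved by value iteration. The sketch below is a generic textbook version, not the method used in the talk. The two-state, two-action example (states "low" and "high", actions "wait" and "harvest") and all of its probabilities and rewards are hypothetical.

```python
def value_iteration(T, R, gamma=0.95, tol=1e-8):
    """Solve a discrete MDP by value iteration.

    T[a][s][s2] = P(s2 | s, a); R[a][s] = expected immediate reward.
    Returns the value function and a greedy policy (one action index per state).
    """
    n_actions, n_states = len(T), len(T[0])
    V = [0.0] * n_states
    while True:
        # Q[s][a]: reward now plus discounted expected value of the next state
        Q = [[R[a][s] + gamma * sum(T[a][s][s2] * V[s2] for s2 in range(n_states))
              for a in range(n_actions)]
             for s in range(n_states)]
        V_new = [max(q) for q in Q]
        if max(abs(v - w) for v, w in zip(V_new, V)) < tol:
            policy = [q.index(max(q)) for q in Q]
            return V_new, policy
        V = V_new


# Hypothetical example: states 0 = low, 1 = high; actions 0 = wait, 1 = harvest.
T = [
    [[0.5, 0.5], [0.0, 1.0]],  # wait: a low stock recovers with probability 0.5
    [[1.0, 0.0], [1.0, 0.0]],  # harvest: the stock always ends up low
]
R = [
    [0.0, 0.0],  # waiting earns nothing
    [0.0, 1.0],  # harvesting pays off only when the stock is high
]

V, policy = value_iteration(T, R)
print(V, policy)
```

In this toy problem the greedy policy waits when the stock is low and harvests when it is high.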

Partially Observed Markov Decision Processes
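
Under partial observability the decision maker tracks a belief, a probability distribution over states, and updates it with Bayes' rule after each action and observation. The sketch below shows that update for a discrete state space. The matrices and the `belief_update` helper are hypothetical illustrations, not taken from the talk.

```python
def belief_update(belief, transition, observation, obs):
    """One Bayes-filter step for a discrete POMDP.

    belief[s]           = P(state = s) before acting
    transition[s][s2]   = P(s2 | s, a) for the action that was taken
    observation[s2][o]  = P(o | s2)
    obs                 = index of the observation actually received
    """
    n = len(belief)
    # Predict: push the belief through the transition model
    predicted = [sum(belief[s] * transition[s][s2] for s in range(n))
                 for s2 in range(n)]
    # Correct: weight each state by the likelihood of the observation
    unnormalized = [observation[s2][obs] * predicted[s2] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("observation has zero probability under the model")
    return [u / total for u in unnormalized]


# Hypothetical two-state example with a noisy sensor.
transition = [[0.9, 0.1], [0.2, 0.8]]
observation = [[0.8, 0.2], [0.3, 0.7]]
new_belief = belief_update([0.5, 0.5], transition, observation, obs=0)
print(new_belief)
```

A POMDP policy maps beliefs like `new_belief`, rather than states, to actions.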

Decision Matrix

State equation

  • If we want to make decisions about the future, we need a predictive model

\[ \vec X_{t+1} = f(\vec X_t, \vec a_t, \xi) \]
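
To make the state equation concrete, here is a minimal simulation sketch for a scalar state rather than a vector. The particular `f` (logistic growth of the stock that escapes harvest, multiplied by log-normal noise), the parameters `r`, `K`, and `sigma`, and the fixed-fraction harvest policy are all assumptions chosen for illustration, not the model from the talk.

```python
import random


def f(x, a, xi, r=1.0, K=100.0):
    """Hypothetical state equation: the stock left after harvest `a` grows
    logistically, and the result is scaled by the noise term `xi`."""
    escaped = max(x - a, 0.0)
    return max(xi * (escaped + r * escaped * (1.0 - escaped / K)), 0.0)


def simulate(x0, policy, steps, sigma=0.1, seed=0):
    """Iterate X_{t+1} = f(X_t, a_t, xi), with a_t = policy(X_t)."""
    rng = random.Random(seed)
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        a = policy(x)                        # action chosen from the current state
        xi = rng.lognormvariate(0.0, sigma)  # multiplicative process noise
        xs.append(f(x, a, xi))
    return xs


# Harvest a fixed 10% of the stock each step (an arbitrary example policy).
trajectory = simulate(x0=50.0, policy=lambda x: 0.1 * x, steps=20)
print(trajectory)
```

Swapping in a different `policy` function is how alternative management actions would be compared under the same predictive model.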

Decision Matrix

POMDP Algorithms

  • Monahan's Enumeration Algorithm (1971, 1982)
  • Sondik's One-Pass Algorithm (1971)
  • Cheng's Linear Support Algorithm (1988)
  • Littman et al.'s Witness Algorithm (1994)
  • Zhang's Incremental Pruning Algorithm (1996)

Modern POMDP Algorithms

  • PBVI (Pineau et al 2003)
  • Perseus (Spaan & Vlassis 2005)
  • HSVI (Smith & Simmons 2005)
  • GapMin (Poupart et al 2011)
  • PEMA (Pineau & Gordon 2005)
  • FSVI (Shani et al 2008)
  • SARSOP (Kurniawati et al 2008)

Does partial observability matter?

Classic, fully-observed solution:

POMDP solutions

Impact on management

Acknowledgements

Collaborators:

  • Milad Memarzadeh
  • Karthik Ram
  • Scott Chamberlain
  • Matt Jones

Funding:

  • NSF, Helmsley
  • ESPM